Text Classification with Tournament Methods

نویسندگان

  • Louise Guthrie
  • Wei Liu
  • Yunqing Xia
چکیده

This paper compares the effectiveness of n-way (n > 2) classification using a probabilistic classifier to the use of multiple binary probabilistic classifiers. We describe the use of binary classifiers in both Round Robin and Elimination tournaments, and compare both tournament methods and n-way classification when determining the language of origin of speakers (both native and non-native English speakers) speaking English. We conducted hundreds of experiments by varying the number of categories as well as the categories themselves. In all experiments the tournament methods performed better than the n-way classifier, and of these tournament methods, on average, Round Robin performs slightly better than the Elimination tournament.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Improving Quality of Email Categorization with Tournament Methods

The tournament text classification methods are proposed in this article to perform the task of email categorization, in which the essence is to break down the multi-class categorization process into a set of binary classification tasks. We implement the methods of elimination and Round Robin tournament to classify emails within 15 folders. Substantial experiments are conducted to compare the ef...

متن کامل

Email Categorization with Tournament Methods

We propose to deploy the tournament methods to perform the task of email categorization. The multi-class categorization process is thus broken down into a set of binary classification tasks. We implement the elimination tournament and Round Robin tournament and apply them to classify emails within 15 folders. We conduct substantial experiments to compare the effectiveness and robustness of the ...

متن کامل

Improving the Operation of Text Categorization Systems with Selecting Proper Features Based on PSO-LA

With the explosive growth in amount of information, it is highly required to utilize tools and methods in order to search, filter and manage resources. One of the major problems in text classification relates to the high dimensional feature spaces. Therefore, the main goal of text classification is to reduce the dimensionality of features space. There are many feature selection methods. However...

متن کامل

A New Document Embedding Method for News Classification

Abstract- Text classification is one of the main tasks of natural language processing (NLP). In this task, documents are classified into pre-defined categories. There is lots of news spreading on the web. A text classifier can categorize news automatically and this facilitates and accelerates access to the news. The first step in text classification is to represent documents in a suitable way t...

متن کامل

A New Approach for Text Documents Classification with Invasive Weed Optimization and Naive Bayes Classifier

With the fast increase of the documents, using Text Document Classification (TDC) methods has become a crucial matter. This paper presented a hybrid model of Invasive Weed Optimization (IWO) and Naive Bayes (NB) classifier (IWO-NB) for Feature Selection (FS) in order to reduce the big size of features space in TDC. TDC includes different actions such as text processing, feature extraction, form...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2005